Geo-Temporal retrieval filtering versus answer resolution using Wikipedia

Authors

  • Jorge Machado
  • José Luis Borbinha
  • Bruno Martins
Abstract

We describe an evaluation experiment on geo-temporal document retrieval created for the GeoTime evaluation task of NTCIR 2011, together with the retrieval techniques developed for that task. We first describe the collections used in the workshop, detailing their composition in terms of geographic and temporal expressions. The first contribution of this work is this set of collection statistics, which by itself shows the relevance of the subject: our parsing techniques found millions of references related to the two dimensions of relevance, time and space. Those references were used to index the documents so that they could be scored along those dimensions. We also introduce a technique to find extra references in Wikipedia, using the Google Search Service and the same parsers applied to the collections. The references were used in four different scenarios, depending on the queries. In the first, we used the references found in the topics to filter out documents without geographic or temporal expressions, and used pseudo-relevance feedback over the indexes created for places and dates to expand topics that had no references. In the second, we used the Wikipedia references to filter documents from the result set. In the third, we expanded all topics with the Wikipedia references. In the last, we used a technique based on metric distances, calculated from coordinates (latitudes and longitudes) and dates, to create a scope for documents and topics and to rank them according to the distance between them.
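The distance-based scenario can be illustrated with a minimal Python sketch. The abstract does not give the actual formula, so everything here is an assumption made for illustration: the `Scope` class (one representative coordinate and date per topic or document), the use of the haversine great-circle distance, the `km_scale` and `day_scale` normalisers, and the simple sum of the two normalised distances. It is not the authors' implementation.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


@dataclass(frozen=True)
class Scope:
    """Hypothetical geo-temporal scope: one representative point and date."""
    lat: float
    lon: float
    when: date


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * EARTH_RADIUS_KM * asin(sqrt(min(1.0, a)))


def scope_distance(topic, doc, km_scale=1000.0, day_scale=365.0):
    """Assumed combination: sum of normalised spatial and temporal distances."""
    geo = haversine_km(topic.lat, topic.lon, doc.lat, doc.lon) / km_scale
    tmp = abs((topic.when - doc.when).days) / day_scale
    return geo + tmp


def rank_by_scope(topic, docs):
    """Return document ids ordered from closest to farthest scope."""
    return sorted(docs, key=lambda doc_id: scope_distance(topic, docs[doc_id]))


if __name__ == "__main__":
    topic = Scope(38.72, -9.14, date(2005, 1, 1))
    docs = {
        "far": Scope(35.68, 139.69, date(1995, 1, 1)),
        "near": Scope(38.72, -9.14, date(2005, 1, 2)),
    }
    print(rank_by_scope(topic, docs))
```

In this sketch a document whose scope is close to the topic in both space and time ranks first. A real system would have to derive each scope from many extracted references per document, and the two dimensions would likely be weighted rather than summed equally.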

Publication date: 2011